

SEQUENCE LISTING 



(1) GENERAL INFORMATION:

(i) APPLICANT: Zavada, Jan 

Pastorekova, Silvia 
Pastorek, Jaromir

(ii) TITLE OF INVENTION: MN Gene and Protein 
(iii) NUMBER OF SEQUENCES: 86 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Leona L. Lauder 

(B) STREET: 465 California Street, Suite 450 

(C) CITY: San Francisco 

(D) STATE: California 

(E) COUNTRY: USA 

(F) ZIP : 94104 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: US 09/772,719 

(B) FILING DATE: 30-JAN-2001 

(C) CLASSIFICATION: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 08/485,049 

(B) FILING DATE: 07-JUN-1995 

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Lauder, Leona L. 

(B) REGISTRATION NUMBER: 30,863 

(C) REFERENCE/DOCKET NUMBER: D-0021.3A-2 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 415-981-2034 

(B) TELEFAX: 415-981-0332 

(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1522 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 



(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CCT G CT CCAG OCC^O, GCAACTGCTG ^ , 

— -cc^ == - : 

" rr: rr G — c — ^ 

CCACCCGGAG AGGAGGATCT TAGAGGATCT ACCTACTGTT 

GAAGTTAAGC CTAAATCAGA AGAAGAGGGC TCC 

„ f araTPCTCA AGAACCCCAG AATAATGCCC ACA 
GAGGCTCCTG GAGATCCTCA C(;GCCCTGGC CCCGGGTGTC CCCAGCCTGC 

GACCAGAGTC ATTGGCGCTA TGGAGGCGAC CCGCCCT ^ 
GCGGGCCGCT T CCAG T CCCC GGTGGATATC CGCCCC - 

^ GCAGCTGCAT CTGCACTGGG GGGCTGCAGG TCGTCCGGGC 
GGGCGGGAGT ACCGGGCTC ^ TCACCTCAGC 

TCGGAGCACA CTGTGGAA&3 

ACCGCCTTTG CCAGAGTTGA CGAGGCCTT ^ 
GCCTTTCTGG AGGAGGGCCC GGAAGAAAAC G- 

GAAGAAATCG CTGAGGAAGG CTCAGAGACT CAGGXCCCA ^ 

(-it a r'TTPPAA TATGAGGGGT CTCTGACTAU ^ 
CTGCCCTCTG ACTTCAGCCG CTACTTCCAA ^ 

o fpr"rp TTTAAC CAGACAGTGA TGCTfcA^ 
GCCCAGGGTG TCATCTGGAC TGTGTTTAAC c^CTTCCGA 

— ^r, ^L. — 

■— °™™™ — — 

AGTCCTCGGG CTGCTGAGCC AGTCCAGCT rvrTCCTTGT GCAGATGAGA 

TTTTGCTGTC ACCAGCGTCG CGTTCCTTil 
GCCCTGGTTT TTGGCCTCCT TTTTGCTG qpcCAGCAGA GGTAGCCGAG 



TTTTAAAATA AATATTTATA AT 



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 459 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

represent mature protein 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Pro Ser Pro Trp Leu Pro Leu Leu lie Pro Ala 



Met Ala Pro Leu Cys Pro ber ~t> — _ 25 

Pro Ala 2 Gly - Tnr Val Gin Leu Leu Leu Ser Leu Leu Leu Leu 
■20 - 15 



Met Pro Val His Pro Gin Arg Leu Pro fg Met Gin Glu Asp Ser Pro 
l Gly Gly Gly Ser L Oly «» « *P « « ^ £» ^ « 
L eu Pro Ser I Glu « Ser Pro Arg Glu Glu Asp Pro Pro Gly Glu 

nv Glu Glu Asp Leu Pro Gly Glu Glu Asp Leu Pro 
Glu Asp Leu Pro Gly Glu biu 55 

q^r Leu Lys Leu Glu Asp 
Glu val Lys Pro Lys Ser Glu Glu Glu Gly Ser y 

Leu Pro Th r val Glu Ma Pro Gly >sp « «- Glu - Gin « « 

A la His A r 9 ASP I Glu G1 y « « - ser His Trp «, TV* Gly 

95 

« i cr Pro Ala Cys Ala Gly Arg Phe 
Gly Asp Pro Pro Trp Pro Arg Val Ser Pro Ala ^ 

Gln ser Z Val « XL ^ « ^ ^ f 3 5 * ^ ^ ^ 

Leu Tg Pro Leu Glu Leu Leu Gly Pne Gin Leu Pro Pro Leu Pro Glu 

j_4 b 

* riv His Ser Val Gin Leu Thr Leu Pro Pro 
Leu Arg Leu Arg Asn Asn Gly His Ser 170 

160 



Gly Le u Glu Met Ma Leu Gly Pro Gly « g Olu Tyr A r 9 Ma Leu Gin 

L eu His Trp Gly Ma Ma Gly *S P» Gly Ser Glu His Thr 

val Glu Gly His *r 9 Phe Pro Ma Giu Xle His Val Val His Leu Ser 

Thr Z Phe Ma Ar g val A sp Glu Ma Leu Gly « g Pro Gly Gly Leu 
220 225 

m,-, rw Pro Glu Glu Asn Ser Ala 
Ala Val Leu Ala Ala Phe Leu Glu Glu Gly Pro ^ 

240 z 
^ Glu Gin Leu Leu Ser Ar 3 Leu Glu Glu He Ma Glu Glu Gly Ser 

Q1 u Thr Gin 2 Pro Gly Leu *sp He Ser Ma Leu Leu Pro Ser *sp 

P he Ser Z Tyr Phe Gin Tyr Glu Gly Ser Leu Thr Thr Pro Pro Cys 

Z Gly val He Trp Thr Val Phe ,sn Gin Thr Val Met Leu Ser 

Z Lys Gin Leu His 2 Leu Ser ,sp Thr Leu Trp Gly Pro Gly *sp 

320 

ser M9 Leu Gin Leu A sn Phe *r g Ma Thr Gin Pro Leu *sn Gly M 3 
val „. Glu Ta ser Phe Pro Ma Gly Val *sp Ser Ser Pro R r g Ma 

Ala Olu Pro Val Gin Leu *.„ Ser Cys Leu Ma Ma Gly *sp He Leu 

365 3 /U 

Leu val Phe Gly Leu Leu Phe Ma Val Thr Ser Val Ma Phe Leu 

Z Gin Met A r 9 *r 9 Z His *r 9 *r 9 Gly Thr Lys Gly Gly Val Ser 

400 

Tyr Arg Pro Ala Glu Val Ala Glu Thr Gly Ala 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 29 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CGCCCAGTGG GTCATCTTCC CCAGAAGAG 29
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: YES

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

GGAATCCTCC TGCATCCGG 19
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10898 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:



GGATCCTGTT GACTCGTGAC CTTACCCCCA ACCCTGTGCT CTCTGAAACA TGAGCTGTGT 60
CCACTCAGGG TTAAATGGAT TAAGGGCGGT GCAAGATGTG CTTTGTTAAA CAGATGCTTG 120
AAGGCAGCAT GCTCGTTAAG AGTCATCACC AATCCCTAAT CTCAAGTAAT CAGGGACACA 180
AACACTGCGG AAGGCCGCAG GGTCCTCTGC CTAGGAAAAC CAGAGACCTT TGTTCACTTG 240
TTTATCTGAC CTTCCCTCCA CTATTGTCCA TGACCCTGCC AAATCCCCCT CTGTGAGAAA 300
CACCCAAGAA TTATCAATAA AAAAATAAAT TTAAAAAAAA AATACAAAAA AAAAAAAAAA 360



AAAAAAAAAA GACTTACGAA TAGTTATTGA TAAATGAATA GCTATTGGTA AAGCCAAGTA 420
AATGATCATA TTCAAAACCA GACGGCCATC ATCACAGCTC AAGTCTACCT GATTTGATCT 480
CTTTATCATT GTCATTCTTT GGATTCACTA GATTAGTCAT CATCCTCAAA ATTCTCCCCC 540
AAGTTCTAAT TACGTTCCAA ACATTTAGGG GTTACATGAA GCTTGAACCT ACTACCTTCT 600
TTGCTTTTGA GCCATGAGTT GTAGGAATGA TGAGTTTACA CCTTACATGC TGGGGATTAA 660
TTTAAACTTT ACCTCTAAGT CAGTTGGGTA GCCTTTGGCT TATTTTTGTA GCTAATTTTG 720
TAGTTAATGG ATGCACTGTG AATCTTGCTA TGATAGTTTT CCTCCACACT TTGCCACTAG 780
GGGTAGGTAG GTACTCAGTT TTCAGTAATT GCTTACCTAA GACCCTAAGC CCTATTTCTC 840
TTGTACTGGC CTTTATCTGT AATATGGGCA TATTTAATAC AATATAATTT TTGGAGTTTT 900
TTTGTTTGTT TGTTTGTTTG TTTTTTTGAG ACGGAGTCTT GCATCTGTCA TGCCCAGGCT 960
GGAGTAGCAG TGGTGCCATC TCGGCTCACT GCAAGCTCCA CCTCCCGAGT TCACGCCATT 1020
TTCCTGCCTC AGCCTCCCGA GTAGCTGGGA CTACAGGCGC CCGCCACCAT GCCCGGCTAA 1080
TTTTTTGTAT TTTTGGTAGA GACGGGGTTT CACCGTGTTA GCCAGAATGG TCTCGATCTC 1140
CTGACTTCGT GATCCACCCG CCTCGGCCTC CCAAAGTTCT GGGATTACAG GTGTGAGCCA 1200
CCGCACCTGG CCAATTTTTT GAGTCTTTTA AAGTAAAAAT ATGTCTTGTA AGCTGGTAAC 1260
TATGGTACAT TTCCTTTTAT TAATGTGGTG CTGACGGTCA TATAGGTTCT TTTGAGTTTG 1320
GCATGCATAT GCTACTTTTT GCAGTCCTTT CATTACATTT TTCTCTCTTC ATTTGAAGAG 1380
CATGTTATAT CTTTTAGCTT CACTTGGCTT AAAAGGTTCT CTCATTAGCC TAACACAGTG 1440
TCATTGTTGG TACCACTTGG ATCATAAGTG GAAAAACAGT CAAGAAATTG CACAGTAATA 1500
CTTGTTTGTA AGAGGGATGA TTCAGGTGAA TCTGACACTA AGAAACTCCC CTACCTGAGG 1560
TCTGAGATTC CTCTGACATT GCTGTATATA GGCTTTTCCT TTGACAGCCT GTGACTGCGG 1620
ACTATTTTTC TTAAGCAAGA TATGCTAAAG TTTTGTGAGC CTTTTTCCAG AGAGAGGTCT 1680
CATATCTGCA TCAAGTGAGA ACATATAATG TCTGCATGTT TCCATATTTC AGGAATGTTT 1740
GCTTGTGTTT TATGCTTTTA TATAGACAGG GAAACTTGTT CCTCAGTGAC CCAAAAGAGG 1800
TGGGAATTGT TATTGGATAT CATCATTGGC CCACGCTTTC TGACCTTGGA AACAATTAAG 1860
GGTTCATAAT CTCAATTCTG TCAGAATTGG TACAAGAAAT AGCTGCTATG TTTCTTGACA 1920
TTCCACTTGG TAGGAAATAA GAATGTGAAA CTCTTCAGTT GGTGTGTGTC CCTNGTTTTT 1980
TTGCAATTTC CTTCTTACTG TGTTAAAAAA AAGTATGATC TTGCTCTGAG AGGTGAGGCA 2040



TTCTTAATCA TGATCTTTAA AGATCAATAA TATAATCCTT TCAAGGATTA TGTCTTTATT 2100
ATAATAAAGA TAATTTGTCT TTAACAGAAT CAATAATATA ATCCCTTAAA GGATTATATC 2160
TTTGCTGGGC GCAGTGGCTC ACACCTGTAA TCCCAGCACT TTGGGTGGCC AAGGTGGAAG 2220
GATCAAATTT GCCTACTTCT ATATTATCTT CTAAAGCAGA ATTCATCTCT CTTCCCTCAA 2280
TATGATGATA TTGACAGGGT TTGCCCTCAC TCACTAGATT GTGAGCTCCT GCTCAGGGCA 2340
GGTAGCGTTT TTTGTTTTTG TTTTTGTTTT TCTTTTTTGA GACAGGGTCT TGCTCTGTCA 2400
CCCAGGCCAG AGTGCAATGG TACAGTCTCA GCTCACTGCA GCCTCAACCG CCTCGGCTCA 2460
AACCATCATC CCATTTCAGC CTCCTGAGTA GCTGGGACTA CAGGCACATG CCATTACACC 2520
TGGCTAATTT TTTTGTATTT CTAGTAGAGA CAGGGTTTGG CCATGTTGCC CGGGCTGGTC 2580
TCGAACTCCT GGACTCAAGC AATCCACCCA CCTCAGCCTC CCAAAATGAG GGACCGTGTC 2640
TTATTCATTT CCATGTCCCT AGTCCATAGC CCAGTGCTGG ACCTATGGTA GTACTAAATA 2700
AATATTTGTT GAATGCAATA GTAAATAGCA TTTCAGGGAG CAAGAACTAG ATTAACAAAG 2760
GTGGTAAAAG GTTTGGAGAA AAAAATAATA GTTTAATTTG GCTAGAGTAT GAGGGAGAGT 2820
AGTAGGAGAC AAGATGGAAA GGTCTCTTGG GCAAGGTTTT GAAGGAAGTT GGAAGTCAGA 2880
AGTACACAAT GTGCATATCG TGGCAGGCAG TGGGGAGCCA ATGAAGGCTT TTGAGCAGGA 2940
GAGTAATGTG TTGAAAAATA AATATAGGTT AAACCTATCA GAGCCCCTCT GACACATACA 3000
CTTGCTTTTC ATTCAAGCTC AAGTTTGTCT CCCACATACC CATTACTTAA CTCACCCTCG 3060
GGCTCCCCTA GCAGCCTGCC CTACCTCTTT ACCTGCTTCC TGGTGGAGTC AGGGATGTAT 3120
ACATGAGCTG CTTTCCCTCT CAGCCAGAGG ACATGGGGGG CCCCAGCTCC CCTGCCTTTC 3180
CCCTTCTGTG CCTGGAGCTG GGAAGCAGGC CAGGGTTAGC TGAGGCTGGC TGGCAAGCAG 3240
CTGGGTGGTG CCAGGGAGAG CCTGCATAGT GCCAGGTGGT GCCTTGGGTT CCAAGCTAGT 3300
CCATGGCCCC GATAACCTTC TGCCTGTGCA CACACCTGCC GCTCACTGCA CCCCCATCCT 3360
AGCTTTGGTA TGGGGGAGAG GGCACAGGGC CAGACAAACC TGTGAGACTT TGGCTCCATC 3420
TCTGCAAAAG GGCGCTCTGT GAGTCAGCCT GCTCCCCTCC AGGCTTGCTC CTCCCCCACC 3480
CAGCTCTCGT TTCCAATGCA CGTACAGCCC GTACACACCG TGTGCTGGGA CACCCCACAG 3540
TCAGCCGCAT GGCTCCCCTG TGCCCCAGCC CCTGGCTCCC TCTGTTGATC CCGGCCCCTG 3600
CTCCAGGCCT CACTGTGCAA CTGCTGCTGT CACTGCTGCT TCTGGTGCCT GTCCATCCCC 3660
AGAGGTTGCC CCGGATGCAG GAGGATTCCC CCTTGGGAGG AGGCTCTTCT GGGGAAGATG 3720



ACCCACTGGG CGAGGAGGAT CTGCCCAGTG AAGAGGATTC ACCCAGAGAG GAGGATCCAC 3780
CCGGAGAGGA GGATCTACCT GGAGAGGAGG ATCTACCTGG AGAGGAGGAT CTACCTGAAG 3840
TTAAGCCTAA ATCAGAAGAA GAGGGCTCCC TGAAGTTAGA GGATCTACCT ACTGTTGAGG 3900
CTCCTGGAGA TCCTCAAGAA CCCCAGAATA ATGCCCACAG GGACAAAGAA GGTAAGTGGT 3960
CATCAATCTC CAAATCCAGG TTCCAGGAGG TTCATGACTC CCCTCCCATA CCCCAGCCTA 4020
GGCTCTGTTC ACTCAGGGAA GGAGGGGAGA CTGTACTCCC CACAGAAGCC CTTCCAGAGG 4080
TCCCATACCA ATATCCCCAT CCCCACTCTC GGAGGTAGAA AGGGACAGAT GTGGAGAGAA 4140
AATAAAAAGG GTGCAAAAGG AGAGAGGTGA GCTGGATGAG ATGGGAGAGA AGGGGGAGGC 4200
TGGAGAAGAG AAAGGGATGA GAACTGCAGA TGAGAGAAAA AATGTGCAGA CAGAGGAAAA 4260
AAATAGGTGG AGAAGGAGAG TCAGAGAGTT TGAGGGGAAG AGAAAAGGAA AGCTTGGGAG 4320
GTGAAGTGGG TACCAGAGAC AAGCAAGAAG AGCTGGTAGA AGTCATCTCA TCTTAGGCTA 4380
CAATGAGGAA TTGAGACCTA GGAAGAAGGG ACACAGCAGG TAGAGAAACG TGGCTTCTTG 4440
ACTCCCAAGC CAGGAATTTG GGGAAAGGGG TTGGAGACCA TACAAGGCAG AGGGATGAGT 4500
GGGGAGAAGA AAGAAGGGAG AAAGGAAAGA TGGTGTACTC ACTCATTTGG GACTCAGGAC 4560
^CTGCCC ACTCACTTTT TTTTTTTTTT TTTTTGAGAC AAACTTTCAC TTTTGTTGCC 4620
CAGGCTGGAG TGCAATGGCG CGATCTCGGC TCACTGCAAC CTCCACCTCC CGGGTTCAAG 4680
TGATTCTCCT GCCTCAGCCT CTAGCCAAGT AGCTGCGATT ACAGGCATGC GCCACCACGC 4740
CCGGCTAATT TTTGTATTTT TAGTAGAGAC GGGGTTTCGC CATGTTGGTC AGGCTGGTCT 4800
CGAACTCCTG ATCTCAGGTG ATCCAACCAC CCTGGCCTCC CAAAGTGCTG GGATTATAGG 4860
CGTGAGCCAC AGCGCCTGGC CTGAAGCAGC CACTCACTTT TACAGACCCT AAGACAATGA 4920
TTGCAAGCTG GTAGGATTGC TGTTTGGCCC ACCCAGCTOC GGTGTTGAGT TTGGGTGCGG 4980
TCTCCTGTGC TTTGCACCTG GCCCGCTTAA GGCATTTGTT ACCCGTAATG CTCCTGTAAG 5040
GCATCTGCGT TTGTGACATC GTTTTGGTCG CCAGGAAGGG ATTGGGGCTC TAAGCTTGAG 5100
CGGTTCATCC TTTTCATTTA TACAGGGGAT GACCAGAGTC ATTGGCGCTA TGGAGGTGAG 5160
ACACCCACCC GCTGCACAGA CCCAATCTGG GAACCCAGCT CTGTGGATCT CCCCTACAGC 5220
CGTCCCTGAA CACTGGTCCC GGGCGTCCCA CCCGCCGCCC ACCGTCCCAC CCCCTCACCT 5280
^CTACCCG GGTTCCCTAA GTTCCTGACC TAGGCGTCAG ACTTCCTCAC TATACTCTCC 5340
CACCCCAGGC GACCCGCCCT GGCCCCGGGT GTCCCCAGCC TGCGCGGGCC GCTTCCAGTC 5400



CCCGGTGGAT ATCCGCCCCC AGCTCGCCGC CTTCTGCCCG GCCCTGCGCC CCCTGGAACT 5460
CCTGGGCTTC CAGCTCCCGC CGCTCCCAGA ACTGCGCCTG CGCAACAATG GCCACAGTGG 5520
TGAGGGGGTC TCCCCGCCGA GACTTGGGGA TGGGGCGGGG CGCAGGGAAG GGAACCGTCG 5580
CGCAGTGCCT GCCCGGGGGT TGGGCTGGCC CTACCGGGCG GGGCCGGCTC ACTTGCCTCT 5640
CCCTACGCAG TGCAACTGAC CCTGCCTCCT GGGCTAGAGA TGGCTCTGGG TCCCGGGCGG 5700
GAGTACCGGG CTCTGCAGCT GCATCTGCAC TGGGGGGCTG CAGGTCGTCC GGGCTCGGAG 5760
CACACTGTGG AAGGCCACCG TTTCCCTGCC GAGGTGAGCG CGGACTGGCC GAGAAGGGGC 5820
AAAGGAGCGG GGCGGACGGG GGCCAGAGAC GTGGCCCTCT CCTACCCTCG TGTCCTTTTC 5880
AGATCCACGT GGTTCACCTC AGCACCGCCT TTGCCAGAGT TGACGAGGCC TTGGGGCGCC 5940
CGGGAGGCCT GGCCGTGTTG GCCGCCTTTC TGGAGGTACC AGATCCTGGA CACCCCCTAC 6000
TCCCCGCTTT CCCATCCCAT GCTCCTCCCG GACTCTATCG TGGAGCCAGA GACCCCATCC 6060
CAGCAAGCTC ACTCAGGCCC CTGGCTGACA AACTCATTCA CGCACTGTTT GTTCATTTAA 6120
CACCCACTGT GAACCAGGCA CCAGCCCCCA ACAAGGATTC TGAAGCTGTA GGTCCTTGCC 6180
TCTAAGGAGC CCACAGCCAG TGGGGGAGGC TGACATGACA GACACATAGG AAGGACATAG 6240
TAAAGATGGT GGTCACAGAG GAGGTGACAC TTAAAGCCTT CACTGGTAGA AAAGAAAAGG 6300
AGGTGTTCAT TGCAGAGGAA ACAGAATGTG CAAAGACTCA GAATATGGCC TATTTAGGGA 6360
ATGGCTACAT ACACCATGAT TAGAGGAGGC CCAGTAAAGG GAAGGGATGG TGAGATGCCT 6420
GCTAGGTTCA CTCACTCACT TTTATTTATT TATTTATTTT TTTGACAGTC TCTCTGTCGC 6480
CCAGGCTGGA GTGCAGTGGT GTGATCTTGG GTCACTGCAA CTTCCGCCTC CCGGGTTCAA 6540
GGGATTCTCC TGCCTCAGCT TCCTGAGTAG CTGGGGTTAC AGGTGTGTGC CACCATGCCC 6600
AGCTAATTTT TTTTTGTATT TTTAGTAGAC AGGGTTTCAC CATGTTGGTC AGGCTGGTCT 6660
CAAACTCCTG GCCTCAAGTG ATCCGCCTGA CTCAGCCTAC CAAAGTGCTG ATTACAAGTG 6720
TGAGCCACCG TGCCCAGCCA CACTCACTGA TTCTTTAATG CCAGCCACAC AGCACAAAGT 6780
TCAGAGAAAT GCCICCATCA TAGCATGTCA ATATGTTCAT ACTCTTAGGT TCATGATGTT 6840
CTTAACATTA GGTTCATAAG CAAAATAAGA AAAAAGAATA ATAAATAAAA GAAGTGGCAT 6900
GTCAGGACCT CACCTGAAAA GCCAAACACA GAATCATGAA GGTGAATGCA GAGGTGACAC 6960
CAACACAAAG GTGTATATAT GGTTTCCTGT GGGGAGTATG TACGGAGGCA GCAGTGAGTG 7020
AGACTGCAAA CGTCAGAAGG GCACGGGTCA CTGAGAGCCT AGTATCCTAG TAAAGTGGGC 7080



TCTCTCCCTC TCTCTCCAGC TTGTCATTGA AAACCAGTCC ACCAAGCTTG TTGGTTCGCA 7140
CAGCAAGAGT ACATAGAGTT TGAAATAATA CATAGGATTT TAAGAGGGAG ACACTGTCTC 7200
TAAAAAAAAA AACAACAGCA ACAACAAAAA GCAACAACCA TTACAATTTT ATGTTCCCTC 7260
AGCATTCTCA GAGCTGAGGA ATGGGAGAGG ACTATGGGAA CCCCCTTCAT GTTCCGGCCT 7320
TCAGCCATGG CCCTGGATAC ATGCACTCAT CTGTCTTACA ATGTCATTCC CCCAGGAGGG 7380
CCCGGAAGAA AACAGTGCCT ATGAGCAGTT GCTGTCTCGC TTGGAAGAAA TCGCTGAGGA 7440
AGGTCAGTTT GTTGGTCTGG CCACTAATCT CTGTGGCCTA GTTCATAAAG AATCACCCTT 7500
TGGAGCTTCA GGTCTGAGGC TGGAGATGGG CTCCCTCCAG TGCAGGAGGG ATTGAAGCAT 7560
GAGCCAGCGC TCATCTTGAT AATAACCATG AAGCTGACAG ACACAGTTAC CCGCAAACGG 7620
CTGCCTACAG ATTGAAAACC AAGCAAAAAC CGCCGGGCAC GGTGGCTCAC GCCTGTAATC 7680
CCAGCACTTT GGGAGGCCAA GGCAGGTGGA TCACGAGGTC AAGAGATCAA GACCATCCTG 7740
GCCAACATGG TGAAACCCCA TCTCTACTAA AAATACGAAA AAATAGCCAG GCGTGGTGGC 7800
GGGTGCCTGT AATCCCAGCT ACTCGGGAGG CTGAGGCAGG AGAATGGCAT GAACCCGGGA 7860
GGCAGAAGTT GCAGTGAGCC GAGATCGTGC CACTGCACTC CAGCCTGGGC AACAGAGCGA 7920
GACTCTTGTC TCAAAAAAAA AAAAAAAAAA GAAAACCAAG CAAAAACCAA AATGAGACAA 7980
AAAAAACAAG ACCAAAAAAT GGTGTTTGGA AATTGTCAAG GTCAAGTCTG GAGAGCTAAA 8040
CTTTTTCTGA GAACTGTTTA TCTTTAATAA GCATCAAATA TTTTAACTTT GTAAATACTT 8100
TTGTTGGAAA TCGTTCTCTT CTTAGTCACT CTTGGGTCAT TTTAAATCTC ACTTACTCTA 8160
CTAGACCTTT TAGGTTTCTG CTAGACTAGG TAGAACTCTG CCTTTGCATT TCTTGTGTCT 8220
GTTTTGTATA GTTATCAATA TTCATATTTA TTTACAAGTT ATTCAGATCA TTTTTTCTTT 8280
TCTTTTTTTT TTTTTTTTTT TTTTTTACAT CTTTAGTAGA GACAGGGTTT CACCATATTG 8340
GCCAGGCTGC TCTCAAACTC CTGACCTTGT GATCCACCAG CCTCGGCCTC CCAAAGTGCT 8400
GGGATTCATT TTTTCTTTTT AATTTGCTCT GGGCTTAAAC TTGTGGCCCA GCACTTTATG 8460
ATGGTACACA GAGTTAAGAG TGTAGACTCA GACGGTCTTT CTTCTTTCCT TCTCTTCCTT 8520
CCTCCCTTCC CTCCCACCTT CCCTTCTCTC CTTCCTTTCT TTCTTCCTCT CTTGCTTCCT 8580
CAGGCCTCTT CCAGTTGCTC CAAAGCCCTG TACTTTTTTT TGAGTTAACG TCTTATGGGA 8640
AGGGCCTGCA CTTAGTGAAG AAGTGGTCTC AGAGTTGAGT TACCTTGGCT TCTGGGAGGT 8700
GAAACTGTAT CCCTATACCC TGAAGCTTTA AGGGGGTGCA ATGTAGATGA GACCCCAACA 8760



TAGATCCTCT TCACAGGCTC AGAGACTCAG GTCCCAGGAC TGGACATATC TGCACTCCTG 
CCCTCTGACT TCAGCCGCTA CTTCCAATAT GAGGGGTCTC TGACTACACC GCCCTGTGCC 
CAGGGTGTCA TCTGGACTGT GTTTAACCAG ACAGTGATGC TGAGTGCTAA GCAGGTGGGC 
CTGGGGTGTG TGTGGACACA GTGGGTGCGG GGGAAAGAGG ATGTAAGATG AGATGAGAAA 
CAGGAGAAGA AAGAAATCAA GGCTGGGCTC TGTGGCTTAC GCCTATAATC CCACCACGTT 
GGGAGGCTGA GGTGGGAGAA TGGTTTGAGC CCAGGAGTTC AAGACAAGGC GGGGCAACAT 
AGTGTGACCC CATCTCTACC AAAAAAACCC CAACAAAACC AAAAATAGCC GGGCATGGTG 
GTATGCGGCC TAGTCCCAGC TACTCAAGGA GGCTGAGGTG GGAAGATCGC TTGATTCCAG 
GAGTTTGAGA CTGCAGTGAG CTATGATCCC ACCACTGCCT ACCATC^A GGATACATTT 
ATTTATTTAT AAAAGAAATC AAGAGGCTGG ATGGGGAATA CAGGAGCTGG AGGGTGGAGC 
CCTGAGGTGC TGGTTGTGAG CTGGCCTGGG ACCCTTGTTT CCTGTCATGC CATGAACCCA 
CCCACACTGT CCACTGACCT CCCXAGCTCC ACACCCTCTC TGACACCCTG TGGGGACCTG 
GTGACTCTCG GCTACAGCTG AACTTCCGAG CGACGCAGCC TTTGAATGGG CGAGTGATTG 
AGGCCTCCTT CCCTGCTGGA GTGGACAGCA GTCCTCGGGC TGCTGAGCCA GGTACAGCTT 
TGTCTGGTTT CCCCCCAGCC AGTAGTCCCT TATCCTCCCA TGTGTGTGCC AGTGTCTGTC 
ATTGGTGGTC ACAGCCCGCC TCTCACATCT OCTTTTXCTC TCCAGTCCAG CTGAATTCCT 
GCCTGGCTGC TGGTGAGTCT GCCCCTCCTC TTGGTCCTGA TGCCAGGAGA CTCCTCAGCA
CCATTCAGCC CCAGGGCTGC TCAGGACCGC CTCTGCTCCC TCTCCTTTTC TGCAGAACAG 
ACCCCAACCC CAATATTAGA GAGGCAGATC ATGGTGGGGA TTCCCCCATT GTCCCCAGAG 
GCTAATTGAT TAGAATGAAG CTTGAGAAAT CTCCCAGCAT CCCTCTCGCA AAAGAATCCC 
CCCCCCTTTT TTTAAAGATA GGGTCTCACT CTGTTTGCCC CAGGCTGGGG TGTTGTGGCA 
CGATCATAGC TCACXGCAGC CTCGAACTCC TAGGCTCAGG CAATCCTTTC ACCTTAGCTT 
CTCAAAGCAC TGGGACTGTA GGCATGAGCC ACTGTGCCTG GCCCCAAACG GCCCTTTTAC 
TTGGCTTTTA GGAAGCAAAA ACGGTGCTTA TCTTACCCCT TCTCGTGTAT CCACCCTCAT 
CCCTTGGCTG GCCTCTTCTG GAGACTGAGG CACTATGGGG CTGCCTGAGA ACTCGGGGCA 
GGGGTGGTGG AGTGCACTGA GGCAGGTGTT GAGGAACTCT GCAGACCCCT CTTCCTTCCC 
AAAGCAGCCC TCTCTGCTCT CCATCGCAGG TGACATCCTA GCCCTGGTTT TTGGCCTCCT 
TTTTGCTGTC ACCAGCGTCG CGTTCCTTGT GCAGATGAGA AGGCAGCACA GGTATTACAC
TGACCCTTTC TTCAGGCACA AGCTTCCCCC ACCCTTGTGG AGTCACTTCA TGCAAAGCGC
ATGCAAATGA GCTGCTCCTG GGCCAGTTTT CTGATTAGCC TTTCCTGTTG TGTACACACA 
GAAGGGGAAC CAAAGGGGGT GTGAGCTACC GCCCAGCAGA GGTAGCCGAG ACTGGAGCCT 
AGAGGCTGGA TCTTGGAGAA TGTGAGAAGC CAGCCAGAGG CATCTGAGGG GGAGCCGGTA 
ACTGTCCTGT CCTGCTCATT ATGCCACTTC CTTTTAACTG CCAAGAAATT TTTTAAAATA
AATATTTATA ATAAAATATG TGTTAGTCAC CTTTGTTCCC CAAATCAGAA GGAGGTATTT 
GAATTTCCTA TTACTGTTAT TAGCACCAAT TTAGTGGTAA TGCATTTATT CTATTACAGT 
TCGGCCTCCT TCCACACATC ACTCCAATGT GTTGCTCC 
(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(A) DESCRIPTION: Signal peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Met Ala Pro Leu Cys Pro Ser Pro Trp Leu Pro Leu Leu Ile Pro Ala
1 5 10 15

Pro Ala Pro Gly Leu Thr Val Gln Leu Leu Leu Ser Leu Leu Leu Leu
20 25 30

Met Pro Val His Pro 
35 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "primer" 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: YES
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TGGGGTTCTT GAGGATCTCC AGGAG 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "primer"

(iii) HYPOTHETICAL: NO
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:
CTCTAACTTC AGGGAGCCCT CTTCTT 
(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 48 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "primer"

(iii) HYPOTHETICAL: NO 

(ix) FEATURE: 

(D) OTHER INFORMATION: N stands for inosine 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CUACUACUAC UAGGCCACGC GTCGACTAGT ACGGGNNGGG NNGGGNNG 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Glu Glu Asp Leu Pro Ser 
1 5 

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 55..60

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:
Gly Glu Asp Asp Pro Leu
1 5



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Asn Asn Ala His Arg Asp Lys Glu Gly Asp Asp Gln Ser His Trp Arg
1 5 10 15

Tyr Gly Gly Asp Pro 
20 

(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(v) FRAGMENT TYPE: internal 



(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 36..51

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

His Pro Gln Arg Leu Pro Arg Met Gln Glu Asp Ser Pro Leu Gly Gly
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Glu Glu Asp Ser Pro Arg Glu Glu Asp Pro Pro Gly Glu Glu Asp Leu 
1 5 10 15

Pro Gly Glu Glu Asp Leu Pro Gly 
20 

(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(v) FRAGMENT TYPE: internal 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 279..291

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Leu Glu Glu Gly Pro Glu Glu Asn Ser Ala Tyr Glu Gln
1 5 10

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 16 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(v) FRAGMENT TYPE: internal 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

Met Arg Arg Gln His Arg Arg Gly Thr Lys Gly Gly Val Ser Tyr Arg
1 5 10 15



(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
GTCGCTAGCT CCATGGGTCA TATGCAGAGG TTGCCCCGGA TGCAG 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 
GAAGATCTCT TACTCGAGCA TTCTCCAAGA TCCAGCCTCT AGG 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(A) DESCRIPTION: AP-2 transcription factor 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: 
TCCCCCACCC
(2) INFORMATION FOR SEQ ID NO: 20:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(A) DESCRIPTION: initiator (Inr) element 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
CCACCCCCAT 10 



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)

(A) DESCRIPTION: p53 binding site 

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: El Deiry et al.

(B) TITLE: "Human genomic DNA sequences define a 

consensus binding site for p53" 

(C) JOURNAL: Nature Genetics 

(D) VOLUME: 1 

(F) PAGES: 44-49 

(G) DATE: 1992 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21: 

AAGCTAGTCC 10 
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 

Leu Glu His His His His His His 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: Initiator consensus sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

YYYCAYYYYY 10



(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: p53 binding site 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(x) PUBLICATION INFORMATION: 

(A) AUTHORS: El Deiry et al.

(B) TITLE: "Human genomic DNA sequences define a 

consensus binding site for p53" 

(C) JOURNAL: Nature Genetics 

(D) VOLUME: 1 

(F) PAGES: 44-49 

(G) DATE: 1992 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24: 

AGGCTTGCTC 10



(2) INFORMATION FOR SEQ ID NO: 25: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25: 

Ser Pro Xaa Xaa 
1 

(2) INFORMATION FOR SEQ ID NO: 26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 

Thr Pro Xaa Xaa 
1 

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 540 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: Proposed MN promoter 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:



CTTGCTTTTC ATTCAAGCTC AAGTTTGTCT CCCACATACC CATTACTTAA CTCACCCTCG 60
GGCTCCCCTA GCAGCCTGCC CTACCTCTTT ACCTGCTTCC TGGTGGAGTC AGGGATGTAT 120
ACATGAGCTG CTTTCCCTCT CAGCCAGAGG ACATGGGGGG CCCCAGCTCC CCTGCCTTTC 180
CCCTTCTGTG CCTGGAGCTG GGAAGCAGGC CAGGGTTAGC TGAGGCTGGC TGGCAAGCAG 240
CTGGGTGGTG CCAGGGAGAG CCTGCATAGT GCCAGGTGGT GCCTTGGGTT CCAAGCTAGT 300
CCATGGCCCC GATAACCTTC TGCCTGTGCA CACACCTGCC CCTCACTCCA CCCCCATCCT 360



AGCTTTGGTA TGGGGGAGAG GGCACAGGGC CAGACAAACC TGTGAGACTT TGGCTCCATC 
TCTGCAAAAG GGCGCTCTGT GAGTCAGCCT GCTCCCCTCC AGGCTTGCTC CTCCCCCACC 
CAGCTCTCGT TTCCAATGCA CGTACAGCCC GTACACACCG TGTGCTGGGA CACCCCACAG

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 445 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 1st MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:
GCCCGTACAC ACCGTGTGCT GGGACACCCC ACAGTCAGCC GCATGGCTCC CCTGTGCCCC
AGCCCCTGGC TCCCTCTGTT GATCCCGGCC CCTGCTCCAG GCCTCACTGT GCAACTGCTG 
CTGTCACTGC TGCTTCTGGT GCCTGTCCAT CCCCAGAGGT TGCCCCGGAT GCAGGAGGAT 
TCCCCCTTGG GAGGAGGCTC TTCTGGGGAA GATGACCCAC TGGGCGAGGA GGATCTGCCC
AGTGAAGAGG ATTCACCCAG AGAGGAGGAT CCACCCGGAG AGGAGGATCT ACCTGGAGAG 
GAGGATCTAC CTGGAGAGGA GGATCTACCT GAAGTTAAGC CTAAATCAGA AGAAGAGGGC 
TCCCTGAAGT TAGAGGATCT ACCTACTGTT GAGGCTCCTG GAGATCCTCA AGAACCCCAG 
AATAATGCCC ACAGGGACAA AGAAG 
(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 2nd MN exon
(iii) HYPOTHETICAL: NO

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

GGGATGACCA GAGTCATTGG CGCTATGGAG 30
(2) INFORMATION FOR SEQ ID NO: 30: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 171 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3rd MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 
GCGACCCGCC CTGGCCCCGG GTGTCCCCAG CCTGCGCGGG CCGCTTCCAG TCCCCGGTGG 
ATATCCGCCC CCAGCTCGCC GCCTTCTGCC CGGCCCTGCG CCCCCTGGAA CTCCTGGGCT 
TCCAGCTCCC GCCGCTCCCA GAACTGCGCC TGCGCAACAA TGGCCACAGT G 
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 143 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 4th MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI -SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 
TGCAACTGAC CCTGCCTCCT GGGCTAGAGA TGGCTCTGGG TCCCGGGCGG GAGTACCGGG 
CTCTGCAGCT GCATCTGCAC TGGGGGGCTG CAGGTCGTCC GGGCTCGGAG CACACTGTGG 
AAGGCCACCG TTTCCCTGCC GAG 
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 93 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5th MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATCCACGTGG TTCACCTCAG CACCGCCTTT GCCAGAGTTG ACGAGGCCTT GGGGCGCCCG 
GGAGGCCTGG CCGTGTTGGC CGCCTTTCTG GAG 
(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 67 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 6th MN exon

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:
GAGGGCCCGG AAGAAAACAG TGCCTATGAG CAGTTGCTGT CTCGCTTGGA AGAAATCGCT 

GAGGAAG 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 158 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 7th MN exon 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:
GCTCAGAGAC TCAGGTCCCA GGACTGGACA TATCTGCACT CCTGCCCTCT GACTTCAGCC 
GCTACTTCCA ATATGAGGGG TCTCTGACTA CACCGCCCTG TGCCCAGGGT GTCATCTGGA 
CTGTGTTTAA CCAGACAGTG ATGCTGAGTG CTAAGCAG 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 145 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 8th MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 
CTCCACACCC TCTCTGACAC CCTGTGGGGA CCTGGTGACT CTCGGCTACA GCTGAACTTC 
CGAGCGACGC AGCCTTTGAA TGGGCGAGTG ATTGAGGCCT CCTTCCCTGC TGGAGTGGAC 
AGCAGTCCTC GGGCTGCTGA GCCAG 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 9th MN exon 

(iii) HYPOTHETICAL: NO 
(iv) ANTI- SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36: 

TCCAGCTGAA TTCCTGCCTG GCTGCTG 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 82 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 10th MN exon 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 
GTGACATCCT AGCCCTGGTT TTTGGCCTCC TTTTTGCTGT CACCAGCGTC GCGTTCCTTG 
TGCAGATGAG AAGGCAGCAC AG 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 191 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 11th MN exon 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
AAGGGGAACC AAAGGGGGTG TGAGCTACCG CCCAGCAGAG GTAGCCGAGA CTGGAGCCTA 
GAGGCTGGAT CTTGGAGAAT GTGAGAAGCC AGCCAGAGGC ATCTGAGGGG GAGCCGGTAA 
CTGTCCTGTC CTGCTCATTA TGCCACTTCC TTTTAACTGC CAAGAAATTT TTTAAAATAA 
ATATTTATAA T 

(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1174 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 1st MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GTAAGTGGTC ATCAATCTCC AAATCCAGGT TCCAGGAGGT TCATGACTCC CCTCCCATAC 
CCCAGCCTAG GCTCTGTTCA CTCAGGGAAG GAGGGGAGAC TGTACTCCCC ACAGAAGCCC 
TTCCAGAGGT CCCATACCAA TATCCCCATC CCCACTCTCG GAGGTAGAAA GGGACAGATG 
TGGAGAGAAA ATAAAAAGGG TGCAAAAGGA GAGAGGTGAG CTGGATGAGA TGGGAGAGAA 
GGGGGAGGCT GGAGAAGAGA AAGGGATGAG AACTGCAGAT GAGAGAAAAA ATGTGCAGAC 
AGAGGAAAAA AATAGGTGGA GAAGGAGAGT CAGAGAGTTT GAGGGGAAGA GAAAAGGAAA 
GCTTGGGAGG TGAAGTGGGT ACCAGAGACA AGCAAGAAGA GCTGGTAGAA GTCATCTCAT 
CTTAGGCTAC AATGAGGAAT TGAGACCTAG GAAGAAGGGA CACAGCAGGT AGAGAAACGT 
GGCTTCTTGA CTCCCAAGCC AGGAATTTGG GGAAAGGGGT TGGAGACCAT ACAAGGCAGA 
GGGATGAGTG GGGAGAAGAA AGAAGGGAGA AAGGAAAGAT GGTGTACTCA CTCATTTGGG 
ACTCAGGACT GAAGTGCCCA CTCACTTTTT TTTTTTTTTT TTTTGAGACA AACTTTCACT 
TTTGTTGCCC AGGCTGGAGT GCAATGGCGC GATCTCGGCT CACTGCAACC TCCACCTCCC 
GGGTTCAAGT GATTCTCCTG CCTCAGCCTC TAGCCAAGTA GCTGCGATTA CAGGCATGCG 
CCACCACGCC CGGCTAATTT TTGTATTTTT AGTAGAGACG GGGTTTCGCC ATGTTGGTCA 
GGCTGGTCTC GAACTCCTGA TCTCAGGTGA TCCAACCACC CTGGCCTCCC AAAGTGCTGG 
GATTATAGGC GTGAGCCACA GCGCCTGGCC TGAAGCAGCC ACTCACTTTT ACAGACCCTA 
AGACAATGAT TGCAAGCTGG TAGGATTGCT GTTTGGCCCA CCCAGCTGCG GTGTTGAGTT 
TGGGTGCGGT CTCCTGTGCT TTGCACCTGG CCCGCTTAAG GCATTTGTTA CCCGTAATGC 
TCCTGTAAGG CATCTGCGTT TGTGACATCG TTTTGGTCGC CAGGAAGGGA TTGGGGCTCT 
AAGCTTGAGC GGTTCATCCT TTTCATTTAT ACAG 
(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 193 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 2nd MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
GTGAGACACC CACCCGCTGC ACAGACCCAA TCTGGGAACC CAGCTCTGTG GATCTCCCCT 
ACAGCCGTCC CTGAACACTG GTCCCGGGCG TCCCACCCGC CGCCCACCGT CCCACCCCCT 
CACCTTTTCT ACCCGGGTTC CCTAAGTTCC TGACCTAGGC GTCAGACTTC CTCACTATAC 
TCTCCCACCC CAG 

(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 131 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3rd MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GTGAGGGGGT CTCCCCGCCG AGACTTGGGG ATGGGGCGGG GCGCAGGGAA GGGAACCGTC 
GCGCAGTGCC TGCCCGGGGG TTGGGCTGGC CCTACCGGGC GGGGCCGGCT CACTTGCCTC 
TCCCTACGCA G 

(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 4th MN intron 

(iii) HYPOTHETICAL: NO 



(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GTGAGCGCGG ACTGGCCGAG AAGGGGCAAA GGAGCGGGGC GGACGGGGGC CAGAGACGTG 
GCCCTCTCCT ACCCTCGTGT CCTTTTCAG 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1400 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5th MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 
GTACCAGATC CTGGACACCC CCTACTCCCC GCTTTCCCAT CCCATGCTCC TCCCGGACTC 
TATCGTGGAG CCAGAGACCC CATCCCAGCA AGCTCACTCA GGCCCCTGGC TGACAAACTC 
ATTCACGCAC TGTTTGTTCA TTTAACACCC ACTGTGAACC AGGCACCAGC CCCCAACAAG 
GATTCTGAAG CTGTAGGTCC TTGCCTCTAA GGAGCCCACA GCCAGTGGGG GAGGCTGACA 
TGACAGACAC ATAGGAAGGA CATAGTAAAG ATGGTGGTCA CAGAGGAGGT GACACTTAAA 
GCCTTCACTG GTAGAAAAGA AAAGGAGGTG TTCATTGCAG AGGAAACAGA ATGTGCAAAG 
ACTCAGAATA TGGCCTATTT AGGGAATGGC TACATACACC ATGATTAGAG GAGGCCCAGT 
AAAGGGAAGG GATGGTGAGA TGCCTGCTAG GTTCACTCAC TCACTTTTAT TTATTTATTT 
ATTTTTTTGA CAGTCTCTCT GTCGCCCAGG CTGGAGTGCA GTGGTGTGAT CTTGGGTCAC 
TGCAACTTCC GCCTCCCGGG TTCAAGGGAT TCTCCTGCCT CAGCTTCCTG AGTAGCTGGG 
GTTACAGGTG TGTGCCACCA TGCCCAGCTA ATTTTTTTTT GTATTTTTAG TAGACAGGGT 
TTCACCATGT TGGTCAGGCT GGTCTCAAAC TCCTGGCCTC AAGTGATCCG CCTGACTCAG 
CCTACCAAAG TGCTGATTAC AAGTGTGAGC CACCGTGCCC AGCCACACTC ACTGATTCTT 
TAATGCCAGC CACACAGCAC AAAGTTCAGA GAAATGCCTC CATCATAGCA TGTCAATATG 
TTCATACTCT TAGGTTCATG ATGTTCTTAA CATTAGGTTC ATAAGCAAAA TAAGAAAAAA 
GAATAATAAA TAAAAGAAGT GGCATCj 1 LAb HACCTCACCT GAAAAGCCAA ACACAGAATC 
ATGAAGGTGA ATGCAGAGGT GAC AC PAAAGGTGTA TATATGGTTT CCTGTGGGGA 
GTATGTACGG AGGCAGCAGT GAGTGACjAL l nPAAACGTCA GAAGGGCACG GGTCACTGAG 
AGCCTAGTAT CCTAGTAAAG TGGGC ltlti fPCTCTCTCT CCAGCTTGTC ATTGAAAACC 
AGTCCACCAA GCTTGTTGGT TCGCACA^ua AfiAGTACATA GAGTTTGAAA TAATACATAG 
GATTTTAAGA GGGAGACACT m iti O T 1 7V A 7\ A. GTCTC AAAAAAACAA CAGCAACAAC AAAAAGCAAC 
AACCATTACA ATTTTATGTT CCCTCAGCAT GAGGAATGGG AGAGGACTAT 
GGGAACCCCC TTCATGTTCC GGCCTTCAGC CATGGCCCTG GATACATGCA CTCATCTGTC 
TTACAATGTC ATTCCCCCAG 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1334 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 6th MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 
GTCAGTTTGT TGGTCTGGCC ACTAATCTCT GTGGCCTAGT TCATAAAGAA TCACCCTTTG 
GAGCTTCAGG TCTGAGGCTG GAGATGGGCT CCCTCCAGTG CAGGAGGGAT TGAAGCATGA 
GCCAGCGCTC ATCTTGATAA TAACCATGAA GCTGACAGAC ACAGTTACCC GCAAACGGCT 
GCCTACAGAT TGAAAACCAA GCAAAAACCG CCGGGCACGG TGGCTCACGC CTGTAATCCC 
AGCACTTTGG GAGGCCAAGG CAGGTGGATC ACGAGGTCAA GAGATCAAGA CCATCCTGGC 
CAACATGGTG AAACCCCATC TCTACTAAAA ATACGAAAAA ATAGCCAGGC GTGGTGGCGG 
GTGCCTGTAA TCCCAGCTAC TCGGGAGGCT GAGGCAGGAG AATGGCATGA ACCCGGGAGG 
CAGAAGTTGC AGTGAGCCGA GATCGTGCCA CTGCACTCCA GCCTGGGCAA CAGAGCGAGA 
CTCTTGTCTC AAAAAAAAAA AAAAAAAAGA AAACCAAGCA AAAACCAAAA TGAGACAAAA 
AAAACAAGAC CAAAAAATGG TGTTTGGAAA TTGTCAAGGT CAAGTCTGGA GAGCTAAACT 
TTTTCTGAGA ACTGTTTATL rnmrn 7\7\ I T , 7\7\ ATPAAATATT TTAACTTTGT AAATACTTTT 
GTTGGAAATC GTTCTCTTCT TALr 1 LAL 1L1 TP^tnTPATTT TAAATCTCAC TTACTCTACT 
AGACCTTTTA GGTTTCTGCT ACjAL 1 ALrkj 1 A p,AAPTPTGCC TTTGCATTTC TTGTGTCTGT 
TTTGTATAGT TATCAATATT r* a t a r r r p r r a tt TAPAAGTTAT TCAGATCATT TTTTCTTTTC 
TTTTTTTTTX r 1 * ~ r ^ TTTTACATCT TTAGTAGAGA CAGGGTTTCA CCATATTGGC 
CAGGCTGCTC TCAAACTCCT GACCTTGTGA TCCACCAGCC TCGGCCTCCC AAAGTGCTGG 
GATTCATTTT TTCTTTTTAA 111 IjtL 1 L 1 OPTTAAAPTT GTGGCCCAGC ACTTTATGAT 
GGTACACAGA GTTAAGAGTG 1 ALAL 1 LALj/\ PPHTPTTTCT TCTTTCCTTC TCTTCCTTCC 
TCCCTTCCCT CCCACCTTCL LI 1L1L1 1 TPPTTTCTTT CTTCCTCTCT TGCTTCCTCA 
GGCCTCTTCC AGTTGCTCCA AALjLLL 1 \j 1 A PTTTTTTTTG AGTTAACGTC TTATGGGAAG 
GGCCTGCACT TAGTGAAGAA Cj 1 KjKj 1 L 1 L AtyJ a PTTH AGTT A CCTTGGCTTC TGGGAGGTGA 
AACTGTATCC CTATACCCTG AAGCTTTAAG GGGGTGCAAT GTAGATGAGA CCCCAALA1A 
GATCCTCTTC ACAG 




(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 512 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 7th MN intron 



(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 



GTGGGCCTGG GGTGTGTGTG GACACAGTGG GTGCGGGGGA AAGAGGATGT AAGATGAGAT 
GAGAAACAGG AGAAGAAAGA AATCAAGGCT GGGCTCTGTG GCTTACGCCT ATAATCCCAC 
CACGTTGGGA GGCTGAGGTG GGAGAATGGT TTGAGCCCAG GAGTTCAAGA CAAGGCGGGG 
CAACATAGTG TGACCCCATC TCTACCAAAA AAACCCCAAC AAAACCAAAA ATAGCCGGGC 
ATGGTGGTAT GCGGCCTAGT CCCAGCTACT CAAGGAGGCT GAGGTGGGAA GATCGCTTGA 
TTCCAGGAGT TTGAGACTGC AGTGAGCTAT GATCCCACCA CTGCCTACCA TCTTTAGGAT 
ACATTTATTT ATTTATAAAA GAAATCAAGA GGCTGGATGG GGAATACAGG AGCTGGAGGG 
TGGAGCCCTG AGGTGCTGGT TGTGAGCTGG CCTGGGACCC TTGTTTCCTG TCATGCCATG 
AACCCACCCA CACTGTCCAC TGACCTCCCT AG 
(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 114 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 8th MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 
GTACAGCTTT GTCTGGTTTC CCCCCAGCCA GTAGTCCCTT ATCCTCCCAT GTGTGTGCCA 
GTGTCTGTCA TTGGTGGTCA CAGCCCGCCT CTCACATCTC CTTTTTCTCT CCAG 
(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 617 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 9th MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
GTGAGTCTGC CCCTCCTCTT GGTCCTGATG CCAGGAGACT CCTCAGCACC ATTCAGCCCC 
AGGGCTGCTC AGGACCGCCT CTGCTCCCTC TCCTTTTCTG CAGAACAGAC CCCAACCCCA 
ATATTAGAGA GGCAGATCAT GGTGGGGATT CCCCCATTGT CCCCAGAGGC TAATTGATTA 
GAATGAAGCT TGAGAAATCT CCCAGCATCC CTCTCGCAAA AGAATCCCCC CCCCTTTTTT 
TAAAGATAGG GTCTCACTCT GTTTCCCCCA GGCTGGGGTG TTGTGGCACG ATCATAGCTC 



ACTGCAGCCT CGAACTCCTA GGCTCAGGCA ATCCTTTCAC CTTAGCTTCT CAAAGCACTG 
GGACTGTAGG CATGAGCCAC TGTGCCTGGC CCCAAACGGC CCTTTTACTT GGCTTTTAGG 
AAGCAAAAAC GGTGCTTATC TTACCCCTTC TCGTGTATCC ACCCTCATCC CTTGGCTGGC 
CTCTTCTGGA GACTGAGGCA CTATGGGGCT GCCTGAGAAC TCGGGGCAGG GGTGGTGGAG 
TGCACTGAGG CAGGTGTTGA GGAACTCTGC AGACCCCTCT TCCTTCCCAA AGCAGCCCTC 
TCTGCTCTCC ATCGCAG 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 130 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 10th MN intron 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GTATTACACT GACCCTTTCT TCAGGCACAA GCTTCCCCCA CCCTTGTGGA GTCACTTCAT 
GCAAAGCGCA TGCAAATGAG CTGCTCCTGG GCCAGTTTTC TGATTAGCCT TTCCTGTTGT 
GTACACACAG 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: Spans 3' part of 1st intron to beyond end of 5th exon 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CAAACTTTCA CTTTTGTTGC CCAGGCTGGA GTGCAATGGC GCGATCTCGG CTCACTGCAA 
CCTCCACCTC CCGGGTTCAA GTGATTCTCC TGCCTCAGCC TCTAGCCAAG TAGCTGCGAT 
TACAGGCATG CGCCACCACG CCCGGCTAAT TTTTGTATTT TTAGTAGAGA CGGGGTTTCG 
CCATGTTGGT CAGGCTGGTC TCGAACTCCT GATCTCAGGT GATCCAACCA CCCTGGCCTC 
CCAAAGTGCT GGGATTATAG GCGTGAGCCA CAGCGCCTGG CCTGAAGCAG CCACTCACTT 
TTACAGACCC TAAGACAATG ATTGCAAGCT GGTAGGATTG CTGTTTGGCC CACCCAGCTG 
CGGTGTTGAG TTTGGGTGCG GTCTCCTGTG CTTTGCACCT GGCCCGCTTA AGGCATTTGT 
TACCCGTAAT GCTCCTGTAA GGCATCTGCG TTTGTGACAT CGTTTTGGTC GCCAGGAAGG 
GATTGGGGCT CTAAGCTTGA GCGGTTCATC CTTTTCATTT ATACAGGGGA TGACCAGAGT 
CATTGGCGCT ATGGAGGTGA GACACCCACC CGCTGCACAG ACCCAATCTG GGAACCCAGC 
TCTGTGGATC TCCCCTACAG CCGTCCCTGA ACACTGGTCC CGGGCGTCCC ACCCGCCGCC 
CACCGTCCCA CCCCCTCACC TTTTCTACCC GGGTTCCCTA AGTTCCTGAC CTAGGCGTCA 
GACTTCCTCA CTATACTCTC CCACCCCAGG CGACCCGCCC TGGCCCCGGG TGTCCCCAGC 
CTGCGCGGGC CGCTTCCAGT CCCCGGTGGA TATCCGCCCC CAGCTCGCCG CCTTCTGCCC 
GGCCCTGCGC CCCCTGGAAC TCCTGGGCTT CCAGCTCCCG CCGCTCCCAG AACTGCGCCT 
GCGCAACAAT GGCCACAGTG GTGAGGGGGT CTCCCCGCCG AGACTTGGGG ATGGGGCGGG 
GCGCAGGGAA GGGAACCGTC GCGCAGTGCC TGCCCGGGGG TTGGGCTGGC CCTACCGGGC 
GGGGCCGGCT CACTTGCCTC TCCCTACGCA GTGCAACTGA CCCTGCCTCC TGGGCTAGAG 
ATGGCTCTGG GTCCCGGGCG GGAGTACCGG GCTCTGCAGC TGCATCTGCA CTGGGGGGCT 
GCAGGTCGTC CGGGCTCGGA GCACACTGTG GAAGGCCACC GTTTCCCTGC CGAGGTGAGC 
GCGGACTGGC CGAGAAGGGG CAAAGGAGCG GGGCGGACGG GGGCCAGAGA CGTGGCCCTC 
TCCTACCCTC GTGTCCTTTT CAGATCCACG TGGTTCACCT CAGCACCGCC TTTGCCAGAG 
TTGACGAGGC CTTGGGGCGC CCGGGAGGCC TGGCCGTGTT GGCCGCCTTT CTGGAGGTAC 
CAGATCCTGG ACACCCCCTA C 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(A) DESCRIPTION: Region of homology to collagen alpha 1 chain 


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Gln Arg Leu Pro Arg Met Gln Glu Asp Ser Pro Leu Gly Gly Gly Ser 
1 5 10 15 

Ser Gly Glu Asp Asp Pro Leu Gly Glu Glu Asp Leu Pro Ser Glu Glu 
20 25 30 

Asp Ser Pro Arg Glu Glu Asp Pro Pro Gly Glu Glu Asp Leu Pro Gly 
35 40 45 

Glu Glu Asp Leu Pro Gly Glu Glu Asp Leu Pro Glu Val Lys Pro Lys 
50 55 60 

Ser Glu Glu Glu Gly Ser Leu Lys Leu Glu Asp Leu Pro Thr Val Glu 
65 70 75 80 

Ala Pro Gly Asp Pro Gln Glu Pro Gln Asn Asn Ala His Arg Asp Lys 
85 90 95 

Glu Gly 



(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 256 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(A) DESCRIPTION: carbonic anhydrase domain 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

Asp Asp Gln Ser His Trp Arg Tyr Gly Gly Asp Pro Pro Trp Pro Arg 
1 5 10 15 

Val Ser Pro Ala Cys Ala Gly Arg Phe Gln Ser Pro Val Asp Ile Arg 
20 25 30 

Pro Gln Leu Ala Ala Phe Cys Pro Ala Leu Arg Pro Leu Glu Leu Leu 
35 40 45 

Gly Phe Gln Leu Pro Pro Leu Pro Glu Leu Arg Leu Arg Asn Asn Gly 
50 55 60 

His Ser Val Gln Leu Thr Leu Pro Pro Gly Leu Glu Met Ala Leu Gly 
65 70 75 80 

Pro Gly Arg Glu Tyr Arg Ala Leu Gln Leu His Leu His Trp Gly Ala 
85 90 95 

Ala Gly Arg Pro Gly Ser Glu His Thr Val Glu Gly His Arg Phe Pro 
100 105 110 

Ala Glu Ile His Val Val His Leu Ser Thr Ala Phe Ala Arg Val Asp 
115 120 125 

Glu Ala Leu Gly Arg Pro Gly Gly Leu Ala Val Leu Ala Ala Phe Leu 
130 135 140 

Glu Glu Gly Pro Glu Glu Asn Ser Ala Tyr Glu Gln Leu Leu Ser Arg 
145 150 155 160 

Leu Glu Glu Ile Ala Glu Glu Gly Ser Glu Thr Gln Val Pro Gly Leu 
165 170 175 

Asp Ile Ser Ala Leu Leu Pro Ser Asp Phe Ser Arg Tyr Phe Gln Tyr 
180 185 190 

Glu Gly Ser Leu Thr Thr Pro Pro Cys Ala Gln Gly Val Ile Trp Thr 
195 200 205 

Val Phe Asn Gln Thr Val Met Leu Ser Ala Lys Gln Leu His Thr Leu 
210 215 220 

Ser Asp Thr Leu Trp Gly Pro Gly Asp Ser Arg Leu Gln Leu Asn Phe 
225 230 235 240 

Arg Ala Thr Gln Pro Leu Asn Gly Arg Val Ile Glu Ala Ser Phe Pro 
245 250 255 

(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(A) DESCRIPTION: transmembrane region 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Ile Leu Ala Leu Val Phe Gly Leu Leu Phe Ala Val Thr Ser Val Ala 
1 5 10 15 

Phe Leu Val Gln 
20 

(2) INFORMATION FOR SEQ ID NO: 53: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(A) DESCRIPTION: intracellular C-terminus 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met Arg Arg Gln His Arg Arg Gly Thr Lys Gly Gly Val Ser Tyr Arg 
1 5 10 15 

Pro Ala Glu Val Ala Glu Thr Gly Ala 
20 25 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Arg Ala Leu Gln Leu His Leu His Trp Gly Ala Ala Gly Arg Pro Gly 
1 5 10 15 

Ser Glu His Thr Val Glu Gly His Arg Phe Pro Ala Glu Ile His Val 
20 25 30 

Val His Leu Ser Thr Ala Phe Ala Arg Val Asp Glu Ala Leu Gly Arg 
35 40 45 

Pro Gly Gly Leu Ala Val Leu Ala Ala Phe Leu Glu Glu Gly Pro Glu 
50 55 60 

Glu Asn Ser Ala Tyr Glu Gln Leu Leu Ser Arg Leu Glu Glu Ile Ala 
65 70 75 80 

Glu Glu Gly Ser Glu Thr Gln Val Pro Gly Leu Asp Ile Ser Ala Leu 
85 90 95 

Leu Pro Ser Asp Phe Ser Arg Tyr Phe Gln Tyr Glu Gly Ser Leu Thr 
100 105 110 

Thr Pro Pro Cys Ala Gln Gly Val Ile Trp Thr Val Phe Asn Gln Thr 
115 120 125 

Val Met Leu Ser Ala Lys Gln Leu His Thr Leu Ser Asp Thr Leu Trp 
130 135 140 

Gly Pro Gly Asp Ser Arg Leu Gln Leu Asn Phe Arg Ala Thr Gln Pro 
145 150 155 160 

Leu Asn Gly Arg Val Ile Glu Ala Ser Phe 
165 170 



(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 470 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: RNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 
CAUGGCCCCG AUAACCUUCU GCCUGUGCAC ACACCUGCCC CUCACUCCAC CCCCAUCCUA 
GCUUUGGUAU GGGGGAGAGG GCACAGGGCC AGACAAACCU GUGAGACUUU GGCUCCAUCU 
CUGCAAAAGG GCGCUCUGUG AGUCAGCCUG CUCCCCUCCA GGCUUGCUCC UCCCCCACCC 
AGCUCUCGUU UCCAAUGCAC GUACAGCCCG UACACACCGU GUGCUGGGAC ACCCCACAGU 
CAGCCGCAUG GCUCCCCUGU GCCCCAGCCC CUGGCUCCCU CUGUUGAUCC CGGCCCCUGC 
UCCAGGCCUC ACUGUGCAAC UGCUGCUGUC ACUGCUGCUU CUGGUGCCUG UCCAUCCCCA 
GAGGUUGCCC CGGAUGCAGG AGGAUUCCCC CUUGGGAGGA GGCUCUUCUG GGGAAGAUGA 
CCCACUGGGC GAGGAGGAUC UGCCCAGUGA AGAGGAUUCA CCCAGAGAGG 
(2) INFORMATION FOR SEQ ID NO: 56: 
(i) SEQUENCE CHARACTERISTICS: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

This sequence is intentionally skipped. 

(2) INFORMATION FOR SEQ ID NO: 57: 
(i) SEQUENCE CHARACTERISTICS: 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 57: 

This sequence is intentionally skipped. 

(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 904 base pairs 
(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 
GCTGGTCTCG AACTCCTGGA CTCAAGCAAT CCACCCACCT CAGCCTCCCA AAATGAGGGA 
CCGTGTCTTA TTCATTTCCA TGTCCCTAGT CCATAGCCCA GTGCTGGACC TATGGTAGTA 
CTAAATAAAT ATTTGTTGAA TGCAATAGTA AATAGCATTT CAGGGAGCAA GAACTAGATT 
AACAAAGGTG GTAAAAGGTT TGGAGAAAAA AATAATAGTT TAATTTGGCT AGAGTATGAG 
GGAGAGTAGT AGGAGACAAG ATGGAAAGGT CTCTTGGGCA AGGTTTTGAA GGAAGTTGGA 
AGTCAGAAGT ACACAATGTG CATATCGTGG CAGGCAGTGG GGAGCCAATG AAGGCTTTTG 
AGCAGGAGAG TAATGTGTTG AAAAATAAAT ATAGGTTAAA CCTATCAGAG CCCCTCTGAC 
ACATACACTT GCTTTTCATT CAAGCTCAAG TTTGTCTCCC ACATACCCAT TACTTAACTC 
ACCCTCGGGC TCCCCTAGCA GCCTGCCCTA CCTCTTTACC TGCTTCCTGG TGGAGTCAGG 
GATGTATACA TGAGCTGCTT TCCCTCTCAG CCAGAGGACA TGGGGGGCCC CAGCTCCCCT 
GCCTTTCCCC TTCTGTGCCT GGAGCTGGGA AGCAGGCCAG GGTTAGCTGA GGCTGGCTGG 
CAAGCAGCTG GGTGGTGCCA GGGAGAGCCT GCATAGTGCC AGGTGGTGCC TTGGGTTCCA 
AGCTAGTCCA TGGCCCCGAT AACCTTCTGC CTGTGCACAC ACCTGCCCCT CACTCCACCC 
CCATCCTAGC TTTGGTATGG GGGAGAGGGC ACAGGGCCAG ACAAACCTGT GAGACTTTGG 
CTCCATCTCT GCAAAAGGGC GCTCTGTGAG TCAGCCTGCT CCCCTCCAGG CTTGCTCCTC 
CCCC 

(2) INFORMATION FOR SEQ ID NO: 59: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 292 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 
TTTTTTTGAG ACGGAGTCTT GCATCTGTCA TGCCCAGGCT GGAGTAGCAG TGGTGCCATC 
TCGGCTCACT GCAAGCTCCA CCTCCCGAGT TCACGCCATT TTCCTGCCTC AGCCTCCCGA 
GTAGCTGGGA CTACAGGCGC CCGCCACCAT GCCCGGCTAA TTTTTTGTAT TTTTGGTAGA 
GACGGGGTTT CACCGTGTTA GCCAGAATGG TCTCGATCTC CTGACTTCGT GATCCACCCG 
CCTCGGCCTC CCAAAGTTCT GGGATTACAG GTGTGAGCCA CCGCACCTGG CC 
(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 262 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 
TTCTTTTTTG AGACAGGGTC TTGCTCTGTC ACCCAGGCCA GAGTGCAATG GTACAGTCTC 
AGCTCACTGC AGCCTCAACC GCCTCGGCTC AAACCATCAT CCCATTTCAG CCTCCTGAGT 
AGCTGGGACT ACAGGCACAT GCCATTACAC CTGGCTAATT TTTTTGTATT TCTAGTAGAG 
ACAGGGTTTG GCCATGTTGC CCGGGCTGGT CTCGAACTCC TGGACTCAAG CAATCCACCC 
ACCTCAGCCT CCCAAAATGA GG 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 294 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 
TTTTTTTTTG AGACAAACTT TCACTTTTGT TGCCCAGGCT GGAGTGCAAT GGCGCGATCT 
CGGCTCACTG CAACCTCCAC CTCCCGGGTT CAAGTGATTC TCCTGCCTCA GCCTCTAGCC 
AAGTAGCTGC GATTACAGGC ATGCGCCACC ACGCCCGGCT AATTTTTGTA TTTTTAGTAG 
AGACGGGGTT TCGCCATGTT GGTCAGGCTG GTCTCGAACT CCTGATCTCA GGTGATCCAA 
CCACCCTGGC CTCCCAAAGT GCTGGGATTA TAGGCGTGAG CCACAGCGCC TGGC 
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 276 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 
TGACAGTCTC TCTGTCGCCC AGGCTGGAGT GCAGTGGTGT GATCTTGGGT CACTGCAACT 
TCCGCCTCCC GGGTTCAAGG GATTCTCCTG CCTCAGCTTC CTGAGTAGCT GGGGTTACAG 
GTGTGTGCCA CCATGCCCAG CTAATTTTTT TTTGTATTTT TAGTAGACAG GGTTTCACCA 
TGTTGGTCAG GCTGGTCTCA AACTCCTGGC CTCAAGTGAT CCGCCTGACT CAGCCTACCA 
AAGTGCTGAT TACAAGTGTG AGCCACCGTG CCCAGC 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 
CGCCGGGCAC GGTGGCTCAC GCCTGTAATC CCAGCACTTT GGGAGGCCAA GGCAGGTGGA 
TCACGAGGTC AAGAGATCAA GACCATCCTG GCCAACATGG TGAAACCCCA TCTCTACTAA 
AAATACGAAA AAATAGCCAG GCGTGGTGGC GGGTGCCTGT AATCCCAGCT ACTCGGGAGG 
CTGAGGCAGG AGAATGGCAT GAACCCGGGA GGCAGAAGTT GCAGTGAGCC GAGATCGTGC 
CACTGCACTC CAGCCTGGGC AACAGAGCGA GACTCTTGTC TCAAAAAAA 
(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 
AGGCTGGGCT CTGTGGCTTA CGCCTATAAT CCCACCACGT TGGGAGGCTG AGGTGGGAGA 
ATGGTTTGAG CCCAGGAGTT CAAGACAAGG CGGGGCAACA TAGTGTGACC CCATCTCTAC 
CAAAAAAACC CCAACAAAAC CAAAAATAGC CGGGCATGGT GGTATGCGGC CTAGTCCCAG 
CTACTCAAGG AGGCTGAGGT GGGAAGATCG CTTGATTCCA GGAGTTTGAG ACTGCAGTGA 
GCTATGATCC CACCACTGCC TACCATCTTT AGGATACATT TATTTATTTA TAAAAGAA 
(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 105 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 
TTTTTTACAT CTTTAGTAGA GACAGGGTTT CACCATATTG GCCAGGCTGC TCTCAAACTC 
CTGACCTTGT GATCCACCAG CCTCGGCCTC CCAAAGTGCT GGGAT 
(2) INFORMATION FOR SEQ ID NO: 66: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 83 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 
CCTCGAACTC CTAGGCTCAG GCAATCCTTT CACCTTAGCT TCTCAAAGCA CTGGGACTGT 
AGGCATGAGC CACTGTGCCT GGC 
(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 
AGAAGGTAAG T 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 
TGGAGGTGAG A 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 



(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 
CAGTCGTGAG G 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 
CCGAGGTGAG C 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 
TGGAGGTACC A 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 
GGAAGGTCAG T 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 
AGCAGGTGGG C 

(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 
GCCAGGTACA G 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 
TGCTGGTGAG T 

(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 11 base pairs 



(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 5' donor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 
ATACAGGGGA T 

(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 
ATACAGGGGA T 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 
CCCCAGGCGA C 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 
ACGCAGTGCA A 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 
TTTCAGATCC A 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 
CCCCAGGAGG G 

(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 
TCACAGGCTC A 

(2) INFORMATION FOR SEQ ID NO: 83: 
(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 
CCCTAGCTCC A 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 
CTCCAGTCCA G 

(2) INFORMATION FOR SEQ ID NO: 85: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 
TCGCAGGTGA CA 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(A) DESCRIPTION: 3' acceptor consensus splice sequence 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 
ACACAGAAGG G 



